Interrater Reliability and Agreement of Subjective Judgments

Authors

  • Howard E. A. Tinsley
  • David J. Weiss
  • Joseph L. Fleiss
  • William G. Miller
Abstract

Indexes of interrater reliability and agreement are reviewed and suggestions are made regarding their use in counseling psychology research. The distinction between agreement and reliability is clarified and the relationships between these indexes and the level of measurement and type of replication are discussed. Indexes of interrater reliability appropriate for use with ordinal and interval scales are considered. The intraclass correlation as a measure of interrater reliability is discussed in terms of the treatment of between-raters variance and the appropriateness of reliability estimates based on composite or individual ratings. The advisability of optimal weighting schemes for calculating composite ratings is also considered. Measures of interrater agreement for ordinal and interval scales are described, as are measures of interrater agreement for data at the nominal level of measurement.
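
The distinction between agreement and reliability mentioned above can be illustrated with a minimal sketch. It assumes two hypothetical raters scoring five subjects on an interval scale, and it uses the Pearson correlation as a simple stand-in for a reliability index (not the intraclass correlation discussed in the abstract) and the proportion of identical ratings as a simple agreement index. The data and function names are illustrative assumptions, not taken from the article.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: sensitive to relative ordering and spacing, not to level."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def proportion_exact_agreement(x, y):
    """Share of subjects to whom both raters gave the identical rating."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical ratings: rater B is always exactly two points above rater A.
rater_a = [1, 2, 3, 4, 5]
rater_b = [3, 4, 5, 6, 7]

reliability = pearson_r(rater_a, rater_b)
agreement = proportion_exact_agreement(rater_a, rater_b)

print(f"correlation between raters: {reliability:.2f}")
print(f"proportion of exact agreement: {agreement:.2f}")
```

In this constructed case the raters order the subjects identically, so the correlation is perfect, yet they never assign the same score, so exact agreement is zero. Which index matters depends on whether the absolute level of the ratings or only their relative ordering is of interest.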

Similar resources

Interrater reliability: the kappa statistic

The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While ther...

An examination of the reliability of a classification algorithm for subgrouping patients with low back pain.

STUDY DESIGN Test-retest design to examine interrater reliability. OBJECTIVE Examine the interrater reliability of individual examination items and a classification decision-making algorithm using physical therapists with varying levels of experience. SUMMARY OF BACKGROUND DATA Classifying patients based on clusters of examination findings has shown promise for improving outcomes. Examining...

The interclinician reliability of Rorschach interpretation in four data sets.

To examine agreement on Rorschach Comprehensive System (CS; Exner, 2004) interpretations, 55 patient protocols were interpreted by 3 to 8 clinicians across 4 data sets on a representative set of 29 characteristics. Substantial reliability was observed across data sets, although a problematic design produced lower results in one. Unexpectedly, a Q-sort task had slightly lower reliability than a ...

The validity of a road test after stroke.

OBJECTIVES To determine the validity of a road test performed by stroke patients in Belgium and to reestablish its reliability. DESIGN Prospective study of a predriving evaluation. SETTING University hospital in Belgium. PARTICIPANTS Thirty-eight patients with sequelae of first-ever stroke. INTERVENTIONS Not applicable. MAIN OUTCOME MEASURES Performance in the Stroke Driver Screening ...

360° Ratings: An Analysis of Assumptions and a Research Agenda for Evaluating Their Validity

This article argues that assumptions surrounding 360° ratings should be examined; most notably, the assumptions that different rating sources have relatively unique perspectives on performance and multiple rating sources provide incremental validity over the individual sources. Studies generally support the first assumption, although reasons for interrater disagreement across different organiza...

Publication date: 2005